A Graph Theoretical Foundation for Integrating RDF Ontologies
Abstract
RDF ontologies are rapidly increasing in number. We study the problem of integrating two RDF ontologies under a given set of Horn clauses that specify semantic relationships between terms in the ontologies, as well as under a given set of negative constraints. We formally define the notion of a “witness” to the integrability of two RDF ontologies under such constraints. A witness represents a way of integrating the ontologies. We define “minimal” witnesses and provide the polynomial-time CROW (Computing RDF Ontology Witness) algorithm to find a witness. We report on the performance of CROW on DAML, SchemaWeb and OntoBroker ontologies as well as on synthetically generated data. The experiments show that CROW works very well on real-life ontologies and scales well to massive ontologies.

Introduction

Since the adoption of the “Resource Description Framework” (RDF) as a web recommendation by the World Wide Web Consortium, there has been growing interest in using RDF to express ontologies about a wide variety of topics. As more and more ontologies emerge about the same topics, there is a growing need to integrate them. There are some initial approaches to merging ontologies in the literature. The pioneering work of (Mitra, Wiederhold, & Jannink 1999) showed that ontology merging is an important problem. In another important paper, (Calvanese, Giacomo, & Lenzerini 2001) develop a model theoretic basis for merging ontologies, assuming they are expressed in description logic. (Bouquet et al. 2003) extend the syntax of OWL by adding constructs to express relationships between multiple OWL ontologies, but do not tell us how to merge ontologies using these relationships. (McGuinness et al. 2000) describe a tool called Chimaera that finds taxonomic “areas” for merging as well as a list of similar terms.
(Stumme & Maedche 2001) use natural language and formal concept analysis methods to merge ontologies using a concept lattice which is explored and transformed by user interactions. Our work differs from the above papers in the following respects: (i) we focus on RDF ontologies; (ii) unlike (Bouquet et al. 2003; McGuinness et al. 2000), we do not focus on finding relationships between terms, but we can use their work as an input to ours; (iii) our algorithms need human intervention only in specifying term relationships: once these are known, we can merge ontologies without human input; (iv) our framework allows not only relationships between terms but also complex Horn constraints; (v) our approach is rooted in the novel concept of an integration witness and in graph theoretic methods, and includes correctness and complexity proofs; (vi) we have a prototype implementation that was tested both on large automatically generated synthetic ontologies and on real ontologies listed at the DAML, SchemaWeb and OntoBroker sites; the implementation shows that our algorithms scale well.

Work supported in part by ARO grant DAAD190310202, ARL grants DAAD190320026 and DAAL0197K0135, NSF grants IIS0329851 and 0205489 and UC Berkeley contract number SA451832441 (subcontract from DARPA’s REAL program). Copyright © 2005, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Preliminaries

In this section, we provide a very brief overview of the most important construct in RDF and show how RDF documents may be viewed as graphs. An RDF-ontology is a finite set of triples $(r, p, v)$ where $r$ is a resource name, $p$ is a property name, and $v$ is a value (which could also be a resource name). RDF-ontologies assume the existence of some set $R$ of resource names, some set $P$ of property names, and a set $\mathrm{dom}(p)$ of values associated with any property name $p$. We do not address reification and containers in RDF due to space constraints.
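As an illustration of the triple-based view, an RDF-ontology can be sketched in Python as a set of (resource, property, value) triples, together with a check that each triple draws its parts from the assumed sets of resource names, property names, and per-property values. This is a minimal sketch, not the paper's implementation: the names `Triple` and `is_well_formed`, the choice to represent the per-property value sets as a dictionary, and the toy data (apart from the name STUDENT:2) are all assumptions made for the example.

```python
from typing import Dict, NamedTuple, Set


class Triple(NamedTuple):
    resource: str  # a resource name
    prop: str      # a property name
    value: str     # a value, possibly itself a resource name


def is_well_formed(ontology: Set[Triple], resources: Set[str],
                   dom: Dict[str, Set[str]]) -> bool:
    """True if every triple uses a known resource, a known property,
    and a value allowed for that property.

    The set of property names is taken to be the key set of `dom`
    (an assumption of this sketch).
    """
    return all(
        t.resource in resources
        and t.prop in dom
        and t.value in dom[t.prop]
        for t in ontology
    )


# Hypothetical toy data; only STUDENT:2 is a name taken from the text.
R = {"STUDENT:2", "PERSON:2"}
dom = {"subClassOf": {"PERSON:2"}}
ontology = {Triple("STUDENT:2", "subClassOf", "PERSON:2")}

print(is_well_formed(ontology, R, dom))
```

Using a set of immutable tuples mirrors the definition of an ontology as a finite set of triples, so duplicate statements collapse automatically.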
Throughout the rest of this paper, we will assume that $R$, $P$, and $\mathrm{dom}$ are all arbitrary, but fixed.

Definition 1 (RDF Ontology graph). Suppose $O$ is an RDF-ontology. An RDF ontology graph for $O$ is a labeled graph $(V, E, \lambda)$ where (1) $V = R \cup \bigcup_{p \in P} \mathrm{dom}(p)$ is the set of nodes. (2) $E = \{(r, r') \mid \text{there exists a property } p \text{ such that } (r, p, r') \in O\}$ is the set of edges. (3) $\lambda(r, r') = \{p \mid (r, p, r') \in O\}$ is the edge labeling function.

Figure 1: RDF (respectively OWL) for the two example ontologies

It is easy to see that there is a one-to-one correspondence between RDF-ontologies and RDF-ontology graphs. Given one of them, we can uniquely determine the other. As a consequence, we will often abuse notation and talk interchangeably about RDF-ontologies and RDF-ontology graphs. Figure 1 shows parts of RDF ontologies 64 and 322 from the DAML web site (www.daml.org); their graphs are shown in Figure 2. Note that both ontologies have had terms renamed (through the attachment of the strings “:1” and “:2” respectively). Thus, STUDENT:2 refers to the student concept in ontology 2.

Given two nodes $r$, $r'$ in an RDF-ontology graph, and a property name $p$, we say that there exists a $p$-path from $r$ to $r'$ if there is a path from node $r$ to node $r'$ such that every edge along the path contains $p$ in its label. For example, in the second graph of Figure 2, there is an S-path from RESEARCHER-IN-ACADEMIA to EMPLOYEE. Here S stands for “subClassOf”. However, there is no T-path from STUDENT:2 to ORGANIZATION-UNIT:2, where T stands for “Affiliate-Of”.

Our techniques differentiate between transitive and non-transitive properties. For instance, SUBCLASSOF is a transitive relationship, while AFFILIATEOF as depicted in Figure 2 is not. A cycle involving a transitive relationship could indicate a semantic problem (e.g. U BOSSOF V, V BOSSOF W, W BOSSOF U seems to indicate a problem). However, a cycle involving a non-transitive property may not be a problem (e.g. U FRIENDOF V, V FRIENDOF W, W FRIENDOF U is not unusual).
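The graph view and the path notion described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the CROW algorithm or the authors' code: the function names `build_graph`, `has_p_path`, and `has_p_cycle` are hypothetical, the node set is simplified to the nodes that occur in some triple, a path is taken to need at least one edge, and the toy triples (including the intermediate node RESEARCHER) are invented for the example rather than read from Figure 2.

```python
from collections import defaultdict, deque


def build_graph(ontology):
    """Map each edge (r, v) to the set of properties p with (r, p, v) in the ontology.

    Simplification: only nodes that occur in some triple are represented.
    """
    label = defaultdict(set)
    for r, p, v in ontology:
        label[(r, v)].add(p)
    return dict(label)


def has_p_path(label, p, src, dst):
    """True if a path with at least one edge leads from src to dst
    and every edge on it has p in its label (breadth-first search)."""
    succ = defaultdict(set)
    for (u, v), props in label.items():
        if p in props:
            succ[u].add(v)
    seen = set()
    queue = deque(succ.get(src, ()))
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        queue.extend(succ[node])
    return False


def has_p_cycle(label, p):
    """True if some node reaches itself along edges labeled with p."""
    sources = {u for (u, _v) in label}
    return any(has_p_path(label, p, n, n) for n in sources)


# Hypothetical toy triples, not the contents of Figure 2.
triples = {
    ("RESEARCHER-IN-ACADEMIA", "subClassOf", "RESEARCHER"),
    ("RESEARCHER", "subClassOf", "EMPLOYEE"),
    ("U", "bossOf", "V"),
    ("V", "bossOf", "W"),
    ("W", "bossOf", "U"),
}
label = build_graph(triples)

print(has_p_path(label, "subClassOf", "RESEARCHER-IN-ACADEMIA", "EMPLOYEE"))
print(has_p_cycle(label, "bossOf"))
```

A cycle check like `has_p_cycle` would only flag a potential problem for properties treated as transitive; for a non-transitive property such as a friendship relation, a cycle is not necessarily an error.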
Definition 2 (Graph embedding). Let $G_1 = (V_1, E_1, \lambda_1)$ and $G_2 = (V_2, E_2, \lambda_2)$ be two RDF-ontology graphs. $G_1$ can be embedded into $G_2$ ...
Similar resources
Integrating Ontologies and Thesauri to Build RDF Schemas
In this paper we present a new approach for building RDF schemas by integrating existing ontologies and structured vocabularies (thesauri). We will present a simple mechanism based on the specification of inclusion relationships between thesaurus terms and ontology concepts and show how these relationships can be exploited to create application-specific RDF schemas incorporating the structural v...
Integrating Ontologies and Thesauri to Build RDF
In this paper we present a new approach for building RDF schemas by integrating existing ontologies and structured vocabularies (thesauri). We will present a simple mechanism based on the specification of inclusion relationships between thesaurus terms and ontology concepts and show how these relationships can be exploited to create application-specific RDF schemas incorporating the structural vi...
Extending Ontologies with Free Keywords in a Collaborative Annotation Environment
Semantic web technologies have introduced the idea of annotating content in terms of concepts taken from ontologies. Since concepts are defined in terms of properties and relations to other concepts, descriptions grow up into larger RDF graphs that can be used as a basis for data integration and intelligent information retrieval. Since ontologies do not typically contain all the possible concep...
Semantic Versioning Manager: Integrating SemVersion in Protégé
Knowledge domains and their semantic representations via ontologies are typically subject to change in practical applications. Additionally, engineering of ontologies often takes place in distributed settings where multiple independent users interact. Therefore, change management for ontologies becomes a crucial aspect for any kind of ontology management environment. We introduce a new RDF-cent...
SPARQL2OWL: Towards Bridging the Semantic Gap Between RDF and OWL
Several large databases in biology are now making their information available through the Resource Description Framework (RDF). RDF can be used for large datasets and provides a graph-based semantics. The Web Ontology Language (OWL), another Semantic Web standard, provides a more formal, model-theoretic semantics. While some approaches combine RDF and OWL, for example for querying, knowledge in ...
Publication date: 2005